Abstract. The latent block model (LBM) is a flexible probabilistic tool
to describe interactions between node sets in bipartite networks, but it
does not account for interactions of time-varying intensity between nodes
in unknown classes. In this paper we propose a non-stationary temporal
extension of the LBM that simultaneously clusters the two node sets of a
bipartite network and constructs classes of time intervals on which
interactions are stationary. The number of clusters as well as the class
memberships are obtained by maximizing the exact complete-data integrated
likelihood with a greedy search approach. Experiments on simulated
and real data are carried out in order to assess the proposed methodology.


1 Introduction 

Since the interactions between nodes of a network generally have a time-varying
intensity, the network has a non-trivial time structure that we aim to infer.
To introduce a temporal dimension, we partition the entire time horizon over
which interactions are observed into disjoint time intervals of an arbitrary
fixed length. Then, we simultaneously cluster the nodes of the bipartite network
and these time intervals, assuming interactions are generated by a latent block
model. A similar view is adopted by Randriamanamihaga, Côme and Govaert [1],
with one substantial difference: they consider time intervals whose membership
is not hidden but known in advance, and hence exogenous, whereas in this paper
we infer the membership of each interval by maximizing a likelihood criterion.
A similar task is accomplished by Guigourès, Boullé and Rossi [3] from a
different point of view: they do not consider fixed-length time intervals, but
associate a time stamp to each interaction “in order to build time segments and
clusters of nodes whose edge distributions are similar and evolve in the same
way over the time segments”. In order to obtain the optimal number of time and
node clusters, we maximize the integrated complete-data likelihood (ICL) using
a greedy search, in a fashion very similar to Wyse, Friel and Latouche [4].
This paper is structured as follows: in Section 2 we present the classical LBM
and detail the proposed time extension. In Section 3 we derive the ICL for this
model and in Section 4 we discuss the experiments we conducted with both
simulated and real data. Section 5 concludes the paper.


2 A non stationary latent block model 

We present here the LBM (Holland et al. [2]), as described in Wyse, Friel and
Latouche (2014). Two sets of nodes are considered: $A = \{a_1, \dots, a_N\}$ and
$B = \{b_1, \dots, b_M\}$. Undirected links between node $i$ from $A$ and node $j$
from $B$ are counted by the observed variable $X_{ij}$, the component $(i,j)$ of the
$N \times M$ adjacency matrix $X = \{X_{ij}\}_{i \le N, j \le M}$. Nodes in $A$ and $B$
are clustered into $K$ and $G$ disjoint subgroups respectively:

$$A = \bigcup_{k \le K} A_k, \qquad A_i \cap A_j = \emptyset, \quad \forall i \ne j$$

and similarly for $B$. Nodes in the same cluster of $A$ have linking attributes of
the same nature to clusters of $B$. We introduce two hidden vectors
$c = \{c_1, \dots, c_N\}$ and $w = \{w_1, \dots, w_M\}$ labeling each node's membership:

$$c_i = k \text{ iff } a_i \in A_k, \ \forall k \le K \qquad \text{and} \qquad w_j = g \text{ iff } b_j \in B_g, \ \forall g \le G.$$

In order to introduce the temporal dimension, consider now a sequence of equally
spaced, adjacent time steps $\{\Delta_u := t_u - t_{u-1}\}_{u \le U}$ over the interval
$[0, T]$ and a partition $C_1, \dots, C_D$ of the same interval.¹ We furthermore
introduce a random vector $y = \{y_u\}_{u \le U}$, such that $y_u = d$ if and only if
$I_u := ]t_{u-1}, t_u] \in C_d$, $\forall d \le D$. We attach to $y$ a multinomial
distribution:

$$p(y \mid \beta, D) = \prod_{d \le D} \beta_d^{|C_d|}$$

where $|C_d| = \#\{I_u : I_u \in C_d\}$. Now we define $N_{ij}^{I_u}$ as the number of
observed connections between $a_i$ and $b_j$ in the time interval $I_u$, and we make
the following crucial assumption:

$$p(N_{ij}^{I_u} \mid c_i = k, w_j = g, y_u = d) \text{ follows a Poisson}(\Delta_u \lambda_{kgd}), \tag{1}$$

hence the number of interactions is conditionally distributed like a Poisson
random variable whose parameter depends on $k, g, d$ ($\Delta_u$ is constant).

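Notation: In the following, for the sake of simplicity, we write:

$$\prod_{k,g,d} := \prod_{k \le K} \prod_{g \le G} \prod_{d \le D} \qquad \text{and} \qquad \prod_{c_i} := \prod_{i : c_i = k}$$

and similarly for $\prod_{w_j}$ and $\prod_{y_u}$.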

The adjacency matrix, denoted $N^{\Delta}$, has three dimensions ($N \times M \times U$) and its
observed likelihood can be computed explicitly:

$$p(N^{\Delta} \mid \Lambda, c, w, y, K, G, D) = \prod_{k,g,d} \frac{\Delta^{S_{kgd}}}{\prod_{c_i} \prod_{w_j} \prod_{y_u} N_{ij}^{I_u}!} \, e^{-\Delta \lambda_{kgd} R_{kgd}} \, \lambda_{kgd}^{S_{kgd}} \tag{2}$$

where we denoted $S_{kgd} := \sum_{c_i} \sum_{w_j} \sum_{y_u} N_{ij}^{I_u}$ and
$R_{kgd} := |A_k| |B_g| |C_d|$, and the subscript $u$ was removed from $\Delta_u$ to
emphasize that time steps are equally spaced for every $u$.

1 T and U are linked by the following relation: $T = \Delta_U U$.
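
The sufficient statistics $S_{kgd}$ and $R_{kgd}$ and the logarithm of the likelihood (2) can be sketched in a few lines. This is only an illustration under assumptions that are not part of the paper: the data are stored as a nested list `counts[i][j][u]`, labels are 0-based, and the function names `sufficient_stats` and `log_likelihood` are hypothetical.

```python
import math
from collections import defaultdict


def sufficient_stats(counts, c, w, y):
    """Return (S, R) keyed by (k, g, d).

    counts[i][j][u] is the number of interactions between a_i and b_j
    during interval I_u; c, w, y are 0-based cluster labels (assumed layout).
    """
    S = defaultdict(int)  # S_kgd: total interactions observed in the block
    R = defaultdict(int)  # R_kgd = |A_k| |B_g| |C_d|: number of cells in the block
    for i, k in enumerate(c):
        for j, g in enumerate(w):
            for u, d in enumerate(y):
                S[k, g, d] += counts[i][j][u]
                R[k, g, d] += 1
    return S, R


def log_likelihood(counts, c, w, y, lam, delta=1.0):
    """Logarithm of equation (2); lam[(k, g, d)] is lambda_kgd (must be > 0)."""
    S, R = sufficient_stats(counts, c, w, y)
    # log of the product of the factorials N_ij^{I_u}! over all cells
    log_fact = sum(math.lgamma(n + 1) for row in counts for cell in row for n in cell)
    total = -log_fact
    for key in R:
        total += S[key] * math.log(delta * lam[key]) - delta * lam[key] * R[key]
    return total
```

Counting cells one by one gives the same $R_{kgd}$ as the product of the three cluster sizes, at the cost of a triple loop that is only reasonable for small examples.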

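Since $c$, $w$ and $y$ are not known, a multinomial factorizing probability density
$p(c, w, y \mid \Phi, K, G, D)$, depending on the hyperparameter $\Phi$, is introduced. The
joint distribution of the labels finally reads:

$$p(c, w, y \mid \Phi, K, G, D) = \left( \prod_{k \le K} \omega_k^{|A_k|} \right) \left( \prod_{g \le G} \rho_g^{|B_g|} \right) \left( \prod_{d \le D} \beta_d^{|C_d|} \right) \tag{3}$$

where $\Phi = \{\omega, \rho, \beta\}$.


3 Exact ICL for non stationary LBM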


3.1 Exact ICL derivation 

The integrated classification likelihood (ICL) was introduced as a model selection
criterion in the context of Gaussian mixture models by Biernacki et al. [5].
Côme and Latouche [6] proposed an exact version of the ICL based on a Bayesian
approach for the stochastic block model, and Wyse, Friel and Latouche [4] applied
the exact ICL to select the number of clusters in a bipartite network using an
LBM. This is the approach we follow here. The quantity we focus on is the
complete-data log-likelihood, integrated with respect to the model parameters $\Phi$
and $\Lambda = \{\lambda_{kgd}\}_{k \le K, g \le G, d \le D}$:

$$\mathrm{ICL} = \log \left( \int p(N^{\Delta}, c, w, y, \Lambda, \Phi \mid K, G, D) \, d\Lambda \, d\Phi \right). \tag{4}$$

Introducing a prior distribution $\nu(\Lambda, \Phi \mid K, G, D)$ over the pair $\Phi, \Lambda$ and thanks
to ad hoc independence assumptions, the ICL can be rewritten as follows:

$$\mathrm{ICL} = \log \left( \nu(N^{\Delta} \mid c, w, y, K, G, D) \right) + \log \left( \nu(c, w, y \mid K, G, D) \right). \tag{5}$$

The choice of prior distributions over the model parameters is crucial to have an 
explicit form of the ICL. 


3.2 A priori distributions 

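We consider the conjugate prior distributions. Thus we impose a Gamma prior
on $\Lambda$:

$$\nu(\lambda_{kgd} \mid a_{kgd}, b_{kgd}) = \frac{b_{kgd}^{a_{kgd}}}{\Gamma(a_{kgd})} \, \lambda_{kgd}^{a_{kgd} - 1} \, e^{-b_{kgd} \lambda_{kgd}}$$

and a factorizing Dirichlet prior distribution on $\Phi$:

$$\nu(\Phi \mid K, G, D) = \mathrm{Dir}_K(\omega; \alpha, \dots, \alpha) \times \mathrm{Dir}_G(\rho; \delta, \dots, \delta) \times \mathrm{Dir}_D(\beta; \gamma, \dots, \gamma).$$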
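
To see why conjugate priors give closed forms, the two standard integrals involved can be written out. This is a sketch that uses only the Poisson likelihood (2), the label distribution (3) and the priors of this section; the shorthand $S$, $R$, $a$, $b$ for $S_{kgd}$, $R_{kgd}$, $a_{kgd}$, $b_{kgd}$ is introduced here for brevity and is not the paper's notation.

```latex
\int_0^\infty \lambda^{S} e^{-\Delta R \lambda}\,
  \frac{b^{a}}{\Gamma(a)}\, \lambda^{a-1} e^{-b\lambda}\, d\lambda
  = \frac{b^{a}}{\Gamma(a)}\, \frac{\Gamma(S+a)}{(\Delta R + b)^{S+a}},
\qquad
\int \prod_{k \le K} \omega_k^{|A_k|}\,
  \mathrm{Dir}_K(\omega; \alpha, \dots, \alpha)\, d\omega
  = \frac{\Gamma(\alpha K)}{\Gamma(\alpha)^{K}}\,
    \frac{\prod_{k \le K} \Gamma(|A_k| + \alpha)}{\Gamma(N + \alpha K)}.
```

The first identity is the Gamma integral applied to one block $(k, g, d)$; the second is the Dirichlet normalizing constant applied to the node labels of $A$, and the same form holds for $\rho$ and $\beta$ with $(G, \delta, M)$ and $(D, \gamma, U)$.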



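It can be proven that the two terms in (5) reduce to:

$$\nu(N^{\Delta} \mid c, w, y, K, G, D) = \prod_{k,g,d} \frac{b_{kgd}^{a_{kgd}}}{\Gamma(a_{kgd})} \, \frac{\Delta^{S_{kgd}}}{\prod_{c_i} \prod_{w_j} \prod_{y_u} N_{ij}^{I_u}!} \, \frac{\Gamma(S_{kgd} + a_{kgd})}{[\Delta R_{kgd} + b_{kgd}]^{S_{kgd} + a_{kgd}}} \tag{6}$$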


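and:

$$\nu(c, w, y \mid K, G, D) = \frac{\Gamma(\alpha K) \prod_{k \le K} \Gamma(|A_k| + \alpha)}{\Gamma(\alpha)^{K} \, \Gamma(N + \alpha K)} \times \frac{\Gamma(\delta G) \prod_{g \le G} \Gamma(|B_g| + \delta)}{\Gamma(\delta)^{G} \, \Gamma(M + \delta G)} \times \frac{\Gamma(\gamma D) \prod_{d \le D} \Gamma(|C_d| + \gamma)}{\Gamma(\gamma)^{D} \, \Gamma(U + \gamma D)} \tag{7}$$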
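
As a numerical illustration, the exact ICL can be evaluated as the sum of the logarithms of the two closed-form terms above, using log-Gamma functions to avoid overflow. The sketch below is not the authors' implementation: the function name `exact_icl`, the nested-list layout `counts[i][j][u]`, the 0-based labels and the use of a single pair of Gamma hyperparameters `a`, `b` shared by all blocks are assumptions made for brevity.

```python
import math
from collections import Counter


def exact_icl(counts, c, w, y, K, G, D,
              a=1.0, b=1.0, alpha=1.0, delta_w=1.0, gamma=1.0, step=1.0):
    """Exact ICL: log of the data term (6) plus log of the label term (7).

    `step` is the common time step Delta; `alpha`, `delta_w`, `gamma` are the
    Dirichlet parameters for omega, rho, beta (hypothetical argument names).
    """
    # Sufficient statistics S_kgd, R_kgd and the factorial term of (6).
    S = Counter()
    R = Counter()
    log_fact = 0.0
    for i, k in enumerate(c):
        for j, g in enumerate(w):
            for u, d in enumerate(y):
                n = counts[i][j][u]
                S[k, g, d] += n
                R[k, g, d] += 1
                log_fact += math.lgamma(n + 1)

    # log of (6): one Gamma-Poisson term per block (empty blocks contribute 0)
    icl = -log_fact
    for k in range(K):
        for g in range(G):
            for d in range(D):
                s, r = S[k, g, d], R[k, g, d]
                icl += (a * math.log(b) - math.lgamma(a)
                        + s * math.log(step)
                        + math.lgamma(s + a)
                        - (s + a) * math.log(step * r + b))

    # log of (7): one Dirichlet-multinomial term per label vector
    for labels, n_clusters, conc in ((c, K, alpha), (w, G, delta_w), (y, D, gamma)):
        sizes = Counter(labels)
        icl += math.lgamma(conc * n_clusters) - n_clusters * math.lgamma(conc)
        icl += sum(math.lgamma(sizes[q] + conc) for q in range(n_clusters))
        icl -= math.lgamma(len(labels) + conc * n_clusters)
    return icl
```

With a single cell, one cluster per dimension and $a = b = \Delta = 1$, the value reduces to the log of a Poisson count marginalized over an exponential prior, which gives a quick sanity check of the formula.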


3.3 ICL Maximization 

In order to maximize the integrated complete likelihood (ICL) in equation (5)
with respect to the six unknowns $c, w, y, K, G, D$, we rely on a greedy search over
the labels and over the number of node and time clusters. This approach is
described in Wyse, Friel and Latouche [4] for a stationary latent block model.
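
The core of such a greedy search can be sketched as a sequence of single-label moves, each accepted only if it increases the criterion. The code below is a simplified illustration, not the algorithm of the cited paper: the function name `greedy_label_search` is hypothetical, `score` is any callable standing in for the exact ICL, and the number of clusters is kept fixed, so the handling of changes in $K$, $G$ and $D$ is left out.

```python
def greedy_label_search(labels, n_clusters, score, max_sweeps=50):
    """Move one label at a time to the cluster that most improves `score`.

    `score` maps a list of labels to a number (a stand-in for the exact ICL).
    The search stops when a full sweep over the labels changes nothing.
    """
    labels = list(labels)
    best = score(labels)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(labels)):
            current = labels[i]
            best_q = current
            for q in range(n_clusters):
                if q == current:
                    continue
                labels[i] = q            # tentatively move element i to cluster q
                candidate = score(labels)
                if candidate > best:     # keep only strict improvements
                    best, best_q = candidate, q
            labels[i] = best_q           # commit the best move (or stay put)
            changed = changed or best_q != current
        if not changed:
            break
    return labels, best
```

For the model of this paper the same routine would be applied in turn to $c$, $w$ and $y$, with `score` closing over the other two label vectors. Because every move is accepted greedily, the result depends on the starting labels and on the order of the sweep.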


4 Experiments 

4.1 Simulated data 

Some experiments on simulated data were conducted first. Based on
the model described in Section 2, we simulated interactions between 50 source
nodes and 50 destination nodes, both clustered in three groups ($K = G = 3$).
Interactions take place in 24 time intervals of unit length (ideally one hour),
also clustered into three groups ($D = 3$). Node and time interval labels are
sampled from multinomial distributions, whose hyperparameters ($\omega, \rho, \beta$) have
all been set equal to {1/3, 1/3, 1/3}. With these settings, we consider 27
Poisson parameters ($\lambda$s), one per triple of clusters, generating connections
between nodes. The generative model used to produce them is:

$$\lambda_{kgl} = s_1[k] + s_2[g] + s_3[l], \qquad k, g, l \in \{1, 2, 3\}$$

where:

$$s_1 = [0, 2, 4], \qquad s_2 = [0.5, 1, 1.5], \qquad s_3 = [0.5, 1, 1.5]$$

and $s_1[k]$ denotes the $k$-th component of $s_1$; similarly for $s_2$ and $s_3$. The greedy
search algorithm we coded was able to recover these initial settings exactly,
converging to the true ICL of -122410. Other experiments were run with different
values in the vectors $s_1, s_2, s_3$. Not surprisingly, the more subtle the differences
between the $\lambda$s, the harder it is for the algorithm to converge to the true
value of the ICL.²

2 Greedy search algorithms are path dependent and they could converge to local maxima. 
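
The simulation setting of Section 4.1 can be reproduced approximately as follows. This is a sketch under assumptions: labels are drawn uniformly over three clusters (matching hyperparameters equal to 1/3), the time step is 1, indices are 0-based, and the helper names `poisson` and `simulate` as well as the seed are hypothetical. The Poisson sampler is Knuth's multiplication method, chosen only to keep the example free of external libraries.

```python
import math
import random
from itertools import product

S1 = [0.0, 2.0, 4.0]
S2 = [0.5, 1.0, 1.5]
S3 = [0.5, 1.0, 1.5]

# lambda_kgl = s1[k] + s2[g] + s3[l], one intensity per triple of clusters
LAMBDA = {(k, g, l): S1[k] + S2[g] + S3[l] for k, g, l in product(range(3), repeat=3)}


def poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    n, prod = 0, rng.random()
    while prod > threshold:
        n += 1
        prod *= rng.random()
    return n


def simulate(n_src=50, n_dst=50, n_steps=24, seed=0):
    """Sample labels uniformly over three clusters, then Poisson counts."""
    rng = random.Random(seed)
    c = [rng.randrange(3) for _ in range(n_src)]
    w = [rng.randrange(3) for _ in range(n_dst)]
    y = [rng.randrange(3) for _ in range(n_steps)]
    counts = [[[poisson(LAMBDA[c[i], w[j], y[u]], rng) for u in range(n_steps)]
               for j in range(n_dst)] for i in range(n_src)]
    return counts, c, w, y
```

Calling `simulate()` returns a 50 x 50 x 24 array of counts together with the true labels, which can then be compared with the labels recovered by a search procedure.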








4.2 Real Data 


The dataset we used was collected during the ACM Hypertext conference
held in Turin, June 29th - July 1st, 2009. Conference attendees volunteered
to wear radio badges that monitored their face-to-face proximity. The dataset
represents the dynamical network of face-to-face proximity of 113 conference
attendees over about 2.5 days.³ Further details can be found in Isella, Stehlé,
Barrat, Cattuto, Pinton, Van den Broeck [7]. We focused on the first conference
day, namely the twenty-four hours from 8.00am on June 29th to 7.59am on
June 30th. In the original data frame the day is partitioned into small time
intervals of 20 seconds. We considered 15-minute aggregations, thus leading
to a partition of the day into 96 consecutive quarter-hours ($U = 96$ in the
previous notation). A typical row of the adjacency matrix we analyzed looks
like:


Person 1   Person 2   Time Interval (15m)   Number of interactions
52         26         5                     16


It means that conference attendees 52 and 26 spoke, between 9am and 9.15am,
for 16 × 20 s = 320 s, i.e. about 5m20s.
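
The aggregation from 20-second records to quarter-hour counts can be sketched as below. The input layout is an assumption: each record is taken to be a tuple `(t, i, j)` with `t` in seconds, and the function name `aggregate` and its arguments are hypothetical. Intervals are numbered from 1, so that with the day starting at 8am, interval 5 covers 9am to 9.15am as in the row shown above.

```python
from collections import Counter


def aggregate(contacts, start, width=900, resolution=20):
    """Count 20 s contact records per (person 1, person 2, 15-minute interval).

    `contacts` is an iterable of (t, i, j) with t in seconds (assumed format);
    intervals are numbered from 1, the first one starting at `start`.
    """
    table = Counter()
    for t, i, j in contacts:
        interval = (t - start) // width + 1
        table[i, j, interval] += 1
    # Convert each count into an approximate contact duration in seconds.
    durations = {key: n * resolution for key, n in table.items()}
    return table, durations
```

For example, sixteen consecutive records between attendees 52 and 26 starting one hour after `start` all fall in interval 5 and correspond to 320 seconds of contact.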

The greedy search algorithm converged to a final ICL of -53217.4, corresponding
to 23 clusters for nodes (people) and 3 time clusters. In Figure 1a we show how
daily quarter-hours are assigned to each cluster: the class $C_1$ contains intervals
marked by a weaker intensity of interactions (on average), whereas intervals
inside $C_3$ are characterized by the highest intensity of interactions. This can be
verified either analytically, by averaging the estimated Poisson intensities for
each of the three clusters, or graphically, by looking at Figure 1b. In this figure
we computed the total number of interactions between conference attendees for each
quarter-hour, and it can clearly be seen that the time intervals with the
highest number of interactions have been placed in cluster $C_3$, those with an
intermediate interaction intensity in $C_2$, and so on. It is interesting
to remark how closely the model recovers times of social gathering such as
the lunch break (13.00-15.00) or the “wine and cheese reception” (18.00-19.00).
A complete program of the day can be found at:
http://www.ht2009.org/program.php

(a) Clustered time intervals.
(b) Connections for every time interval.
Fig. 1: The aggregated connections for every time interval (1b) and the time clusters
found by our model (1a) are compared.

3 More information can be found at:
http://www.sociopatterns.org/datasets/hypertext-2009-dynamic-contact-network/

5 Conclusions 

We proposed a non-stationary extension of the latent block model (LBM)
allowing us to simultaneously infer the time structure of a bipartite network and
cluster the two node sets. The approach we chose consists in partitioning the
entire time horizon into fixed-length time intervals, which are then clustered on
the basis of the intensity of connections in each interval. We derived the exact
ICL for such a model and maximized it numerically, by means of a greedy search,
for two different networks: a simulated one and a real one. The results of these
two tests highlight the capacity of the model to capture non-stationary time
structures.